Multi-objective evolutionary algorithms for data clustering

Author

  • Oliver Kirkland
Abstract

In this work we investigate the use of Multi-Objective metaheuristics for the data mining task of clustering. We first investigate methods for evaluating the quality of clustering solutions, then propose a new Multi-Objective clustering algorithm driven by multiple measures of cluster quality, and finally compare the performance of different Multi-Objective clustering algorithms. In the context of clustering, a robust measure for evaluating clustering solutions is an important component of an algorithm. These Cluster Quality Measures (CQMs) should rely solely on the structure of the clustering solution. A robust CQM should have three properties: it should reward a “good” clustering solution; it should decrease in value monotonically as solution quality deteriorates; and it should be able to evaluate clustering solutions with varying numbers of clusters. We review existing CQMs and present an experimental evaluation of their robustness. We find that measures based on connectivity are more robust than other measures for cluster evaluation. We then introduce a new Multi-Objective Clustering Algorithm (MOCA). Multi-Objective optimisation is desirable in clustering because it permits the incorporation of multiple measures of cluster quality. Since the definition of what constitutes a good clustering is far from clear, it is beneficial to develop algorithms that can accommodate multiple CQMs. The selection of the CQMs used as objectives for MOCA is informed by our previous work with internal evaluation measures. We explain the implementation details of MOCA and perform experimental work to establish its worth. We compare MOCA with k-means
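To make the case for several quality measures concrete, here is a minimal Python sketch. It is not MOCA and not taken from the thesis. It scores a few hand-made candidate clusterings with two illustrative objectives and keeps the Pareto-optimal ones. Everything in it is an assumption chosen for illustration: the function names, the toy data, and the two measures (a nearest-neighbour connectivity penalty and within-cluster variance). The abstract does not say which CQMs MOCA actually uses.

```python
from math import dist

def connectivity_penalty(points, labels, n_neighbours=2):
    """Add 1/rank whenever a point's rank-th nearest neighbour lies in a
    different cluster (lower is better). Illustrative measure only."""
    total = 0.0
    for i, p in enumerate(points):
        # Indices of all other points, ordered by distance from p.
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: dist(p, points[j]))
        for rank, j in enumerate(others[:n_neighbours], start=1):
            if labels[j] != labels[i]:
                total += 1.0 / rank
    return total

def within_cluster_variance(points, labels):
    """Sum of squared distances from each point to its cluster centroid
    (lower is better)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for members in clusters.values():
        centroid = [sum(coord) / len(members) for coord in zip(*members)]
        total += sum(dist(p, centroid) ** 2 for p in members)
    return total

def dominates(a, b):
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimised)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(scored):
    """Keep the (name, objectives) pairs that no other pair dominates."""
    return [(name, s) for name, s in scored
            if not any(dominates(t, s) for _, t in scored)]

# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Hypothetical candidate clusterings (one label per point).
candidates = {
    "two_clusters": [0, 0, 0, 1, 1, 1],
    "one_cluster": [0, 0, 0, 0, 0, 0],
    "split_bad": [0, 1, 0, 1, 0, 1],
    "singletons": [0, 1, 2, 3, 4, 5],
}

scored = [(name, (connectivity_penalty(points, labels),
                  within_cluster_variance(points, labels)))
          for name, labels in candidates.items()]
front = pareto_front(scored)
front_names = [name for name, _ in front]
```

On this toy data, variance alone would favour the all-singletons solution, while the connectivity penalty alone cannot separate the one-cluster and two-cluster solutions. Keeping the non-dominated set retains the trade-off instead of committing to a single measure.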

Similar resources

Multi-layer Clustering Topology Design in Densely Deployed Wireless Sensor Network using Evolutionary Algorithms

Because of resource constraints and dynamic parameters, reducing energy consumption has become one of the most important issues in wireless sensor network (WSN) topology design. All proposed hierarchical methods cluster a WSN into different cluster layers in a single step of an evolutionary algorithm with complicated parameters, which may reduce efficiency and performance. In fact, in WSNs topology, increasin...

An Approach to Reducing Overfitting in FCM with Evolutionary Optimization

Fuzzy clustering methods are conveniently employed in constructing a fuzzy model of a system, but they need to tune some parameters. In this research, FCM is chosen for fuzzy clustering. Parameters such as the number of clusters and the value of fuzzifier significantly influence the extent of generalization of the fuzzy model. These two parameters require tuning to reduce the overfitting in the...

Improved Automatic Clustering Using a Multi-Objective Evolutionary Algorithm With New Validity measure and application to Credit Scoring

In data mining, clustering is an important technique for separating unlabeled data into groups without supervision. In this paper, an attempt has been made to improve and optimize the application of heuristic clustering methods such as the genetic algorithm, the PSO algorithm, the artificial bee colony algorithm, the harmony search algorithm and differential evolution to the unlabeled data of an Iranian bank...

A Multi-Objective Approach to Fuzzy Clustering using ITLBO Algorithm

Data clustering is one of the most important areas of research in data mining and knowledge discovery. Recent research in this area has shown that the best clustering results can be achieved using multi-objective methods. In other words, assuming more than one criterion as objective functions for clustering data can measurably increase the quality of clustering. In this study, a model with two ...

Using Metaheuristic Algorithms Combined with Clustering Approach to Solve a Sustainable Waste Collection Problem

Sustainability is a monumental issue that should be considered in designing a logistics system. In order to incorporate sustainability concepts in our study, a waste collection problem with economic, environmental, and social objective functions was addressed. The first objective function minimized overall costs of the system, including establishment of depots and treatment facilities. Addressi...

Automatic Data Clustering Using an Improved Imperialist Competitive Algorithm

The Imperialist Competitive Algorithm (ICA) is regarded as a leading metaheuristic for finding the global optimum in optimization problems. This paper presents an application of ICA to the automatic clustering of huge unlabeled data sets. By using a proper structure for each of the chromosomes and the ICA, at run time, the suggested method (ACICA) finds the optimum number of clusters while optim...

Publication date: 2014